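CS50 Week 4 Memory

Notes on computer memory.

Pixels

Zoom far enough into an image and you will see individual pixels. A pixel can be represented in RGB format, where Red, Green, and Blue each take 1 byte, i.e. 8 bits. Since \(2^8=256\), each RGB component ranges from 0 to 255 in decimal.

White: R: 255, G: 255, B: 255, or #FFFFFF.
Red: R: 255, G: 0, B: 0, or #FF0000.

Hexadecimal

Hexadecimal uses 16 digits to represent numbers: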

Text Only

0 1 2 3 4 5 6 7 8 9 A B C D E F

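Advantages of hexadecimal:

In decimal, representing 255 takes 3 symbols (2, 5, 5);
In hexadecimal, representing 255 takes only two symbols, FF, because \(16\times15+1\times15=255\).
So hexadecimal saves space on screen.
It does not save any computer memory, though, because every value is ultimately stored as binary (0s and 1s).

In essence, each hexadecimal digit stands for exactly 4 bits, so one byte is always written as two hexadecimal digits.

To avoid confusing hexadecimal with decimal, we usually write hexadecimal numbers with the prefix 0x.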

Addresses and pointers

A pointer is a variable that stores the address of another variable.

C

#include <stdio.h>

int main(void)
{
    int n = 50;
    // Declare a pointer variable that holds the address of n.
    int *p = &n;
    printf("%p\n", p);
}

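Each run of this program may print a different address, because the variable can be placed at a different location in memory each time.

Besides multiplication, * has two other uses:

Declaring a pointer variable, which uses int *, for example int *p = &n;
Dereferencing, that is, getting the value stored at the address the pointer holds.

For example: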

C

#include <stdio.h>

int main(void)
{
    int n = 50;
    int *p = &n;
    printf("%p\n", p);
    // Print the value stored at the address held in p.
    printf("%i\n", *p);
}

This prints (the address will vary):

0x7ffda0a4767c
50

Whether a pointer points to an int, a string, a float, or anything else, the pointer variable itself takes up 8 bytes (64 bits) on a 64-bit system.

Strings

The usual way to think about a string:

A string is a sequence of characters.

We can define a string like this:

C

#include <cs50.h>
#include <stdio.h>

int main(void)
{
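    // With cs50.h included, a string can be declared like this.
    string s = "HI!";
    // Without cs50.h, a char array does the same job.
    // It needs its own name, because s is already declared in this scope.
    char t[] = "HI!";
    printf("%s\n", s);
    printf("%s\n", t);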
}

In fact, s is also the address of the string's first character. So we can define a string like this as well:

C

char *s = "HI!";

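Strictly speaking, C has no built-in string data type. A string has always been just the address of its first character, and the string type in cs50.h is a typedef for char *.

For example, the following code: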

C

#include <stdio.h>

int main(void)
{
    char *s = "HI!";
    printf("%p\n", s);
    printf("%p\n", &s[0]);
    printf("%p\n", &s[1]);
    printf("%p\n", &s[2]);
    printf("%p\n", &s[3]);
}

prints:

Bash

$ make address
$ ./address
0x402004
0x402004
0x402005
0x402006
0x402007

Pointer arithmetic

s[1] is syntactic sugar for *(s + 1). The following two programs therefore do the same thing:

Using s[1]

C

#include <stdio.h>

int main(void)
{
    char *s = "HI!";
    printf("%c\n", s[0]);
    printf("%c\n", s[1]);
    printf("%c\n", s[2]);
}

Using *(s + 1)

C

#include <stdio.h>

int main(void)
{
    char *s = "HI!";
    printf("%c\n", *s);
    printf("%c\n", *(s + 1));
    printf("%c\n", *(s + 2));
}

Both print:

Bash

$ make address
$ ./address
H
I
!

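The code above uses pointer addition. How do we add to or subtract from an address? What do s + 1 and s + 2 mean?

In fact, we do not need to think about how many bytes each element of the array takes up. The compiler scales the offset by the element size, so s + 1 is always the address of the next element.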

Comparing and copying

Comparing

Suppose we want to check whether two strings are the same. Consider this program:

C

#include <cs50.h>
#include <stdio.h>

int main(void)
{
    char *s = get_string("s: ");
    char *t = get_string("t: ");

    if (s == t)
    {
        printf("Same\n");
    }
    else
    {
        printf("Different\n");
    }
}

It prints:

Bash

$ make compare
$ ./compare
s: HI!
t: BYE!
Different
$ ./compare
s: HI!
t: HI!
Different

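As we can see, even when the two inputs are identical, the output is still "Different". This is because if (s == t) compares s and t themselves, that is, it checks whether the two addresses are equal, and get_string stores each input at its own address.

The fix is to use strcmp, which compares the characters and returns 0 when the two strings are equal. The corrected program: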

C

#include <cs50.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    char *s = get_string("s: ");
    char *t = get_string("t: ");

    if (strcmp(s, t) == 0)
    {
        printf("Same\n");
    }
    else
    {
        printf("Different\n");
    }
}

Output:

Bash

$ make compare
$ ./compare
s: HI!
t: HI!
Same

Copying

C

#include <cs50.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    string s = get_string("s: ");

    string t = s;

    t[0] = toupper(t[0]);

    printf("s: %s\n", s);
    printf("t: %s\n", t);
}

It prints:

Bash

$ make copy
$ ./copy
s: hi!
s: Hi!
t: Hi!

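We only capitalized the first letter of t, so why did s change as well?

This is because s and t are both pointers, and they point to the same address. Changing t[0] really changes the value stored at the address t points to. Since s points to that same address, printing the string s points to shows the capital letter too.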

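Memory allocation

To solve this problem, we can take a different approach: first allocate a block of memory for t, then copy the characters of s into t one at a time.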

C

#include <cs50.h>
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    char *s = get_string("s: ");

    // Allocate as many bytes as s needs: its length plus 1 for the terminating '\0'.
    char *t = malloc(strlen(s) + 1);

    for (int i = 0, n = strlen(s) + 1; i < n; i++)
    {
        t[i] = s[i];
    }

    t[0] = toupper(t[0]);

    printf("s: %s\n", s);
    printf("t: %s\n", t);

    // Release the memory obtained from malloc.
    free(t);
}

Alternatively, call the strcpy function:

C

#include <cs50.h>
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    char *s = get_string("s: ");

    char *t = malloc(strlen(s) + 1);

    strcpy(t, s);

    t[0] = toupper(t[0]);

    printf("s: %s\n", s);
    printf("t: %s\n", t);

    // Free the memory.
    free(t);
}

The result is shown below. Now capitalizing t leaves s unchanged.

Bash

$ make copy
$ ./copy
s: hi!
s: hi!
t: Hi!

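valgrind: memory management

The code below contains a bug. It allocates room for only 3 ints, which are x[0], x[1], and x[2], but the assignments start at x[1] instead of x[0], so the write to x[3] lands past the end of the allocated block (a buffer overflow).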

C

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    int *x = malloc(3 * sizeof(int));
    x[1] = 72;
    x[2] = 73;
    x[3] = 33;
}

Yet it compiles and runs without any error message:

Bash

$ make memory
$ ./memory
$

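valgrind is a command-line tool that can check a program for memory problems.

In its output, Invalid write of size 4 points to the out-of-bounds write, and the line 12 bytes in 1 blocks are definitely lost reports that the allocated block was never freed.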

Bash

$ valgrind ./memory
==5902== Memcheck, a memory error detector
==5902== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==5902== Using Valgrind-3.15.0 and LibVEX; rerun with -h for copyright info
==5902== Command: ./memory
==5902== 
==5902== Invalid write of size 4
==5902==    at 0x401162: main (memory.c:9)
==5902==  Address 0x4bd604c is 0 bytes after a block of size 12 alloc'd
==5902==    at 0x483B7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==5902==    by 0x401141: main (memory.c:6)
==5902== 
==5902== 
==5902== HEAP SUMMARY:
==5902==     in use at exit: 12 bytes in 1 blocks
==5902==   total heap usage: 1 allocs, 0 frees, 12 bytes allocated
==5902== 
==5902== 12 bytes in 1 blocks are definitely lost in loss record 1 of 1
==5902==    at 0x483B7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==5902==    by 0x401141: main (memory.c:6)
==5902== 
==5902== LEAK SUMMARY:
==5902==    definitely lost: 12 bytes in 1 blocks
==5902==    indirectly lost: 0 bytes in 0 blocks
==5902==      possibly lost: 0 bytes in 0 blocks
==5902==    still reachable: 0 bytes in 0 blocks
==5902==         suppressed: 0 bytes in 0 blocks
==5902== 
==5902== For lists of detected and suppressed errors, rerun with: -s
==5902== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 0 from 0)

Garbage values

If we print the contents of an array without initializing it first, we see some strange numbers.

C

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    int scores[3];
    for (int i = 0; i < 3; i++)
    {
        printf("%i\n", scores[i]);
    }
}

Bash

$ make garbage
$ ./garbage
68476128
32765
0

These strange numbers are garbage values: whatever happened to be left in that memory, possibly from code that ran earlier.

Swapping

The following code does not swap x and y:

C

#include <stdio.h>

void swap(int a, int b);

int main(void)
{
    int x = 1;
    int y = 2;

    printf("x is %i, y is %i\n", x, y);
    swap(x, y);
    printf("x is %i, y is %i\n", x, y);
}

void swap(int a, int b)
{
    int tmp = a;
    a = b;
    b = tmp;
}

Result:

Bash

$ make swap
$ ./swap
x is 1, y is 2
x is 1, y is 2

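This is because C passes arguments by value: swap receives copies of x and y in a and b, so exchanging a and b has no effect on x and y.

Pointers solve this problem: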

C

#include <stdio.h>

void swap(int *a, int *b);

int main(void)
{
    int x = 1;
    int y = 2;

    printf("x is %i, y is %i\n", x, y);
    // Note that the arguments are the addresses of x and y.
    swap(&x, &y);
    printf("x is %i, y is %i\n", x, y);
}

void swap(int *a, int *b)
{
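    // a is a pointer. Go to the address a points to and copy the value there into tmp, so tmp is now 1.
    int tmp = *a;
    // b is a pointer. Go to the address b points to and copy the value there to the address a points to, so the value at a's address is now 2.
    *a = *b;
    // Finally, store tmp at the address b points to, so the value at b's address is now 1.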
    *b = tmp;
}

Result:

Bash

$ make swap
$ ./swap
x is 1, y is 2
x is 2, y is 1
